Within Cluster Resampling

Author

  • Clarice R. Weinberg
Abstract

ELAINE BORLAND HOFFMAN: Within Cluster Resampling (Under the direction of Clarice R. Weinberg). Dependence among observations from the same group is known as within-cluster correlation. Distinct clusters are typically independent, but special methods are required to account for the within-cluster correlation. Although ignoring the within-cluster correlation may still produce consistent parameter estimates, the variances will be underestimated. Within Cluster Resampling (WCR) is proposed as a method for analyzing any clustered data; however, this dissertation focuses on clustered binary data. The proposed idea is to randomly sample one observation from each cluster. The result is a set of independent observations, called a resampled data set, which can be analyzed with standard software. As the name Within Cluster Resampling implies, this process is repeated, each time randomly sampling, with replacement, one observation from each cluster. Resampling a large number of times creates a series of data sets. The regression parameter is estimated by the average of the resample-based estimates. The proposed variance estimator accounts for the correlation among observations from the same cluster. A WCR parameter is interpreted as the population-averaged difference associated with a unit change in a covariate, for a randomly selected observation from a randomly selected cluster. WCR differs from traditional marginal analysis methods: it is a cluster-based method that weights each cluster equally, whereas traditional marginal methods are observation-based, so the weight of a cluster increases with its size.
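The resampling loop described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the parameter being estimated is simply the proportion of positive outcomes (standing in for a regression parameter), the function name `wcr_estimate`, the toy `clusters` data, and the number of resamples are all hypothetical, and the variance estimator is left out because the abstract does not give its form.

```python
import random
from statistics import mean


def wcr_estimate(clusters, n_resamples=2000, seed=0):
    """Average of per-resample estimates, drawing one observation per cluster.

    Hypothetical helper: the 'analysis' of each resampled data set is just
    the mean of the sampled binary outcomes.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        # One randomly chosen observation from every cluster gives a data
        # set of independent observations.
        resampled = [rng.choice(observations) for observations in clusters]
        # Analyze the resampled data set with a standard estimator.
        estimates.append(mean(resampled))
    # The WCR point estimate is the average over all resamples.
    return mean(estimates)


# Toy clustered binary data (made up): one large cluster, three small ones.
clusters = [[1, 1, 1, 1, 1, 1, 1, 1], [0], [0], [0, 1]]

wcr = wcr_estimate(clusters)
# Cluster-based target: every cluster counts equally.
cluster_weighted = mean(mean(c) for c in clusters)
# Observation-based target: large clusters dominate.
pooled = mean(x for c in clusters for x in c)

print(f"WCR estimate:            {wcr:.3f}")
print(f"Mean of cluster means:   {cluster_weighted:.3f}")
print(f"Pooled observation mean: {pooled:.3f}")
```

In this toy data the WCR estimate tracks the mean of the cluster means rather than the pooled mean, which is the equal-weighting-per-cluster behavior the abstract contrasts with observation-based marginal methods.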


Similar resources

Using Unsupervised Learning to Guide Resampling in Imbalanced Data Sets

The class imbalance problem causes a classifier to overfit the data belonging to the class with the greatest number of training examples. The purpose of this paper is to argue that methods that equalize class membership are not as effective as possible when applied blindly and that improvements can be obtained by adjusting for the within-class imbalance. A guided resampling technique is proposed and...


Within-Cluster Resampling for Analysis of Family Data: Ready for Prime-Time?

Hoffman et al. [1] proposed an elegant resampling method for analyzing clustered binary data. The focus of their paper was to perform association tests on clustered binary data using the within-cluster resampling (WCR) method. Follmann et al. [2] extended Hoffman et al.'s procedure more generally, with applicability to angular data, combining p-values, testing vectors of parameters, and Bayesi...


Resampling Method for Unsupervised Estimation of Cluster Validity

We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters that are stable against resampling give rise to local maxima of this figure of merit. This is presented first for a one-dimensional data set,...


Practice of Epidemiology Internal Validation of Risk Models in Clustered Data: A Comparison of Bootstrap Schemes

Internal validity of a risk model can be studied efficiently with bootstrapping to assess possible optimism in model performance. Assumptions of the regular bootstrap are violated when the development data are clustered. We compared alternative resampling schemes in clustered data for the estimation of optimism in model performance. A simulation study was conducted to compare regular resampling...


Research on approach for classification of Within imbalanced data sets

Most existing methods for imbalanced data classification consider only the imbalance between classes and ignore the imbalance within a class, which affects the final classification results. To eliminate the within-class imbalance, a clustering algorithm based on the DBSCAN algorithm is put forward to handle the imbalance problem within the class. T...



Publication date: 1998